Development and comparison of two approaches for visual speech analysis with application to voice activity detection

نویسندگان

Bertrand Rivet

Andrew J. Aubrey

Laurent Girin

Yulia Hicks

Christian Jutten

Jonathon A. Chambers

چکیده

In this paper we present two novel methods for visual voice activity detection (V-VAD) which exploit the bimodality of speech (i.e. the coherence between speaker’s lips and the resulting speech). The first method uses appearance parameters of a speaker’s lips, obtained from an active appearance model (AAM). An HMM then dynamically models the change in appearance over time. The second method uses a retinal filter on the region of the lips to extract the required parameter. A corpus of a single speaker is applied to each method in turn, where each method is used to classify voice activity as speech or non speech. The efficiency of each method is evaluated individually using receiver operating characteristics and their respective performances are then compared and discussed. Both methods achieve a high correct silence detection rate for a small false detection rate.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals maybe classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

متن کامل

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...

متن کامل

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...

متن کامل

Speech and Reading Disorders Screening, and Problems in Structure and Function of Articulation Organs in Children in Mashhad City, Iran

Background and Objectives: Investigating the prevalence of speech and language disorders and the contributing factors can help determine the best treatment options suited to the needs of these patients. So far, no comprehensive study has been conducted on screening speech and reading disorders and problems in the structure and function of articulation organs (PSFAOs) in children in Mashhad City...

متن کامل

Improving Voice Outcomes After Injury to the Recurrent Laryngeal Nerve

Objectives: The present study aimed to determine the voice outcomes before and after the administration of voice therapy in patients who suffered an injury to the recurrent laryngeal nerve after undergoing thyroidectomy. Methods: The sample consisted of 26 patients (2 males and 24 females) aged between 18 and 80 years (m=55±12) who experienced injury to the recurrent laryngeal nerve fol...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2007

Development and comparison of two approaches for visual speech analysis with application to voice activity detection

نویسندگان

چکیده

منابع مشابه

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Speech and Reading Disorders Screening, and Problems in Structure and Function of Articulation Organs in Children in Mashhad City, Iran

Improving Voice Outcomes After Injury to the Recurrent Laryngeal Nerve

عنوان ژورنال:

اشتراک گذاری